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Abstract. The destriping technique is a viable tool for removing different kinds of systematic effects in CMB- 
related experiments. It has already been proven to work for gain instabilities that produce the so-called 1// noise 
and periodic fluctuations due to e.g. thermal instability. Both effects, when coupled to the observing strategy, 
result in stripes on the observed sky region. Here we present a maximum-likelihood approach to this type of 
technique and provide also a useful generalization. As a working case we consider a data set similar to what the 
Planck satellite will produce in its Low Frequency Instrument (LFI). We compare our method to those presented 
in the literature and find some improvement in performance. Our approach is also more general and allows for 
different base functions to be used when fitting the systematic effect under consideration. We study the effect of 
increasing the number of these base functions on the quality of signal cleaning and reconstruction. This study is 
related to Planck LFI activities. 
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1. Introduction 

One of the major goals of cosmology is to determine the 
cosmological parameters which describe the structure and 
evolution of the Universe. In this respect CMB observa- 
tions are a powerful tool directly probing the early phases 
of the Universe. Recent results from the WMAP satellite 
(Bennett et al. 12003) show that high accuracy in such mea- 
surements can be achieved with an optimal choice of ob- 
serving site (the second Lagrange point of the Sun-Earth 
system, L2, for good thermal and environmental stabil- 
ity), careful instrument design and control of systematic 
effects. The last point is related to the observing strategy 
adopted, which should be as redundant as possible, with 
different measurements of the same sky region with dif- 
ferent detectors and on different time scales in order to 
properly control systematics. 

Future space missions like Planck^ which are de- 
signed to have a signal-to-noise ratio of the order of few 
tens (far larger than WMAP), require control of system- 
atic effects at the fxK level. In this respect several tech- 
niques have been developed to treat systematics, to detect 
and remove them in the best possible way. Burigana et al. 
ifTMTji . Delabrouille l(T^Mjl and Maino et al. ifT^IMT^ 
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have considered, in the context of the Planck mission, a 
simple destriping algorithm to remove systematics like the 
1//" noise. Mennella et al. (2002) have instead considered 
destriping when dealing with periodic fluctuations such as 
those induced by thermal instabilities. 

Destriping methods work on time-ordered data (TOD) 
and produce TOD cleaned of systematics. When TOD is 
cleaned it is possible to co-add observations on the same 
region (pixel) of the sky to obtain a sky map which gives a 
visual impression of the data. Although it is non-optimal, 
in the sense that it would not necessarily produce the map 
with the minimum possible variance as instead provided 
by the Generalized Least Square solution of the map- 
making problem (see e.g. Natoli et al. I2001|) . it provides 
a fast and accurate map-making algorithm. In addition, 
the analysis of TOD cleaned of systematics is useful for 
several applications relevant for the Planck data analy- 
sis (e.g. in-flight main beam reconstruction (Burigana et 
al. I2002|l and calibration (Bersanelli et al. 1997. Piat et 
al. 120031 Cappellini et al. l2003|l . time series analysis) and 
for the scientific exploitation of Planck data (e.g. source 
variability studies (Terenzi et al. I2()()2|l ). 

In this paper we consider the destriping technique in 
the light of maximum-likelihood analysis and present a 
general formulation of the destriping technique. We re- 
strict our analysis to 1// noise fluctuations. They produce 
noise which is strongly correlated in time and, when cou- 
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pled with the observing strategy, will lead to stripes in the 
final maps that would alter the signal statistics. This is of 
extreme importance for the CMB which is expected to be 
a Gaussian random field. 

The basic idea in destriping is to model the noise in the 
TOD by a linear combination of simple arithmetic func- 
tions, such as polynomials or Fourier components. The 
amplitudes of these base functions are determined taking 
advantage of the redundancy of the scanning strategy for 
which the same points on the sky are monitored several 
times during the mission. In its simplest form destriping 
involves fitting uniform baselines, i.e. one baseline for each 
elementary scanning period. In order to improve the ac- 
curacy of the method, we present here the possibility of 
fitting several components (base functions). 

The destriping method of Burigana et al. H1997f) and 
Maino et al. fl999, 2002) differs from the destriping 
method of Delabrouillc (1998) in the weights they assign 
to different map pixels based on the number of measure- 
ments falling on that pixel. Our maximum-likelihood anal- 
ysis presented in this paper leads to a weighting scheme 
that differs from both of these. Therefore we compare re- 
sults obtained from all these three methods. 

The paper is organized as follows. In Sect.|2we present 
the maximum-likelihood approach to the destriping tech- 
nique, in Sect. 121 we apply it in the case of uniform base- 
lines, and in Sect. ^ we generalize the discussion to arbi- 
trary base functions. We present our conclusions in Sect.[Sl 

2. Destriping - maximum likelihood approach 

2.1. Maximum likelihood analysis 

In the following we present a maximum-likelihood based 
approach to the destriping problem. We assume that data 
produced by a generic detector at a given time t could be 
written as: 



yt 



(1) 



where mp is the sky signal, assumed to be pixelized, Ptp 
is the pointing matrix, p is the pixel index, nt.corr is the 
correlated noise component while nt is the white noise 
component. The variance of the white noise component is 
represented by a diagonal matrix C„ in the time domain. 
Eq.JQ) could be written in vector form as: 



Pm + rip 



(2) 



We model the correlated noise component of the TOD 
as follows. The TOD is divided in to elementary scanning 
periods, which we shall here call "rings" (as appropriate 
for the Planck scanning strategy). For each ring j we 
define a constant offset aj, so that 



yt 



Ptpf^p 



nt. (3) 

Here Ftj equals unity if point t lies on ring j. We write 
this in matrix form as 



y = Pm -|- Fa -|- n. 



(4) 



We treat both the map and the correlated noise compo- 
nent as deterministic. With these assumptions, we obtain 
the likelihood function 

x' = (y-Fa-Pm)^C-i(y-Fa-Pm). (5) 

If Nt is the length (the number of samples) of the TOD 
stream, A^pix is the number of pixels in the map, and Na 
is the number of unknown amplitudes, then the sizes of 
the matrices are: [F] = {Nt.Na), [P] = {Nt,Npi^), [C„] = 
{Nt,Nt). 

We now want to find the maximum likelihood solution 
for a. We need to minimize the function in Eq. jSjl with 
respect to both of the unknown variables m and a. First 
we find the minimum with respect to m, 

V„x' = -2P^C-i(y-Fa-Pm)=0. (6) 

From this we can solve the map m, 

m=(P^C~ip)-ip^C;:i(y-Fa). (7) 
Substituting Eq. Q back into Eq. © we obtain 
= (y - Fa)^Z^C-iZ(y - Fa), (8) 

where 

Z = I-P(P^C-ip)-ip^C-i. (9) 

Here I denotes the unit matrix. 

The next step would be to minimize with respect 
to a, 

Vax' = -2F^C-iZ(y-Fa)=0. (10) 
The minimum is given by 

F^C-^ZFa = F'^C-^Zy. (11) 

Here we have used the property Z'^C^^Z = C^^Z. 

We assume from now on C„ = diag{cr^). With this 
simplification, the minimum of ^ is given by 

F^ZFa = F^Zy. (12) 

where 

Z = I-P(P^P)-ip'^. (13) 

The effect of Z acting on a TOD is to subtract from each 
sample the average of all samples hitting the same pixel. 
The solution to Eq. H12(l is not unique. We may add an 
arbitrary constant to a without changing the value of x^- 
This is equivalent to varying the monopole component of 
the CMB map, and is irrelevant for anisotropy measure- 
ments. To remove this ambiguity we require that the sum 
of baselines is zero, a^l = 0. Here 1 is a vector with all el- 
ements equal to 1. This is equivalent to adding term ll'^a 
to the left-hand side of Ea. l(T^ . 
The solution is now given by 

a= [F'^ZF-hll^]"^F^Zy. (14) 

Matrix F^ZF + 11^, unlike F^ZF, is non-singular, pro- 
vided that there are enough intersection points between 
the rings. 11"^ denotes a matrix with all elements equal to 
one. When the amplitude vector a has been determined, 
the CMB map can be computed according to Eq. 
These equations are the main theoretical result of this 
paper. 
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2.2. Pointing and beam shape 

The matrix P spreads the map m into TOD. In princi- 
ple, the beam shape and profile can be incorporated in P. 
The elements of P then determine the weights that diflter- 
ent pixels contribute to a given measurement. In this way 
beams with arbitrary shapes and profiles can be treated. 
In this case the matrix P-^P is non-diagonal. Due to its 
large size, in Planck type of missions its inversion would 
present a major computational burden. 

A simpler approach is to consider each sample in the 
TOD to represent the temperature of the pixel at the 
center of the beam. Then P takes a particularly simple 
form, consisting of ones and zeros for a full-power mea- 
surement like Planck. Each row contains one non-zero 
element identifying the pixel on which the corresponding 
measurement falls. Matrix P'^P becomes diagonal, the di- 
agonal elements giving the number of hits on each pixel. 
We follow this approach from here on. 

2.3. Comparison with earlier work 

Equation ((SJ can be put into form 

where index p labels pixels, i,j scanning rings, and fc,Z 
samples on a given ring. A combined index ik or jl iden- 
tifies a measurement. We define the symbol so that 
(F^j, = 1 if measurement ik falls into pixel p, otherwise 
d^f. = 0. Due to the factors dfj, and d^j in the numera- 
tor of Eq. (|15|l only those pixels p contribute to the pixel 
sum which lie on two or more scanning rings, and the sum 
Sifc jl equal to 2 times the sum over all pairs of mea- 
surements falling on pixel p. 

Eq. (|15|) can be compared to Eq. (2) of Maino et al. 
or to Eq. (10) of Burigana et al. lfTW7|l . The for- 
mulae differ in that in Eq. (|15l) has in the denominator 
the term np — J^ik'^ik' which gives the total number of 
hits in pixel p. 

Delabrouille H1998|l gives the general form 

^w{p,ik,jl){yik-yji-ai+aj)'^ (16) 

p(isky pairs 

for the function S to be minimized. Here the second sum 
refers to all pairs (ik, jl) that can be formed of the mea- 
surements falling onto pixel p, and u; is a weight func- 
tion to be chosen. Based on the fact that pixel p con- 
tributes np{np — l)/2 pairs, Delabrouille suggests choosing 
w oc i/{np — 1). The result of Maino et al. 1)1999(1 corre- 
sponds to w = const, and our new result, Eq. ifT^ . to 

w = l/Up. 

2.4. Circular scanning 

In the nominal scanning strategy of Planck the spin axis 
follows the ecliptic plane. The spin axis is kept anti-solar 



by repointing it by 2.5' every hour. The spacecraft rotates 
around the spin axis at a rate of 1 rpm. During one hour 
Planck scans the same circle on the sky 60 times. As 
the sky signal is almost time-independent, the data can be 
coadded to reduce the length of the data stream by a factor 
of 60. We call this set of 60 circles, and the corresponding 
segment of the coadded TOD a "ring" . The crossing points 
of the rings are important calibration points, which allow 
for the removal of the correlated noise component from 
the TOD. 

The opening angle of the scanning circle varies between 
80-90 degrees, depending on the location of the detector on 
the focal plane. The sampling frequency for the 100 GHz 
LFI receiver is 108.3 Hz. The instrument then collects 6498 
temperature values, or "samples" , at each rotation of the 
spacecraft, corresponding roughly to a 3' shift between 
successive samples. A total of 8766 rings builds up one 
year of observations. 

In reality, the angular velocity of the rotation does 
not stay exactly constant, especially immediately after re- 
pointing, and the samples from different circles of the same 
ring are shifted in position. This will probably require dis- 
carding the first few circles of each ring, and resampling or 
phase binning the rest before performing the coaddition. 
During the ring the spin axis of the satellite will follow a 
nutation ellipse with maximum amplitude of 1.5' at the 
end of the nutation damping phase (van Leeuwen et al. 
I2002|) . This may degrade slightly the performance of de- 
striping. We ignore these complications in this paper. 

The destriping technique applies particularly well to 
a PLANCK-like measurement pattern resulting from the 
coadding of scanning circles into rings which breaks the 
stationarity of the data. The 1//" noise component in 
the coadded TOD is well presented by a piecewise defined 
function, where each piece consists of a linear combination 
of a few base functions. 

3. Uniform baselines 

3.1. Simulation results 

We have carried out simulations of the Planck LFI 100 
GHz detector. The underlying CMB map was created by 
the Synfast code of the HEALPix package^, starting from 
the CMB anisotropy angular power spectrum computed 
with the CMBFAST code^ (see Seljak & Zaldarriaga lTMHl 
and references therein) using the cosmological parameters 
iltot = 1-00, r^A = 0.7, n^h^ = 0.02, h = 0.7, n = 1.00, 
and T = 0.0. We created the input map with HEALPix 
resolution iVgido — 1024 and with a symmetric Gaussian 
beam with a full width at half maximum (FWHM) of 10'. 
We then formed the signal TOD by picking temperatures 
from this map. All our output maps have the resolution 
parameter A'sidc=512, corresponding to an angular reso- 
lution 7'. 

^ http:/ /www. eso.org/science/healpix 

^ http://physics.nyu.edu/matiasz/CMBFAST/cmbfast.html 
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The scanning pattern corresponds to the nominal 
Planck scanning strategy of the 100 GHz LFI detector 
number 10. The angle between the satellite spin axis and 
the optical axis of the telescope is 85°. The beam center 
is pointing towards (fi*, </>) = (3?737, 126?228). Here 9 is 
the angle from the optical axis and is an angle counted 
clockwise from the axis pointing from the center of the 
focal plane towards the satellite spin axis. We assumed no 
spin axis precession. Our simulated data set consists of 
5040 scanning rings, corresponding to 7 months of mea- 
surement time. The TOD stream contains 6498 samples 
on each ring, corresponding to a sampling frequency of 
fs = 108.3 Hz. The sky coverage is 98.5%. 

In our simulations we have assumed a symmetric 
Gaussian beam, and convolved the input map with the 
beam. 

The rings cross at points which are mostly concen- 
trated near the ecliptic poles. We count as crossing points 
all points where two measurements on different rings fall 
on the same pixel. Since our pixel resolution (7') exceeds 
the repointing angle of the spin axis (2.5'), the crossing 
points include cases where two nearby rings pass paral- 
lel through the same pixel, without actually crossing each 
other. 

We used the Stochastic Differential Equation (SDE) 
technique to create the instrument noise stream, which 
we added to the signal TOD.'^ We generated noise with 
the power spectrum 

^(/)=(l + j)^, (/>/n.i„) (17) 

with parameters a ~ 4800/iK (CMB temperature scale), 
/k = O.lHz, and fnun — 10~^Hz. The parameter /k is 
called the knee frequency. The noise level 4800^K corre- 
sponds to the estimated white noise level of one lOOGHz 
LFI detector. 

Fig. n shows a five-hour section of a coadded noise 
TOD and the baselines fitted to it. Fig.|21shows the base- 
line distribution from a set of 10 simulated 7-month noise 
TODs. 

There was no foreground included in the simulations 
presented in this paper, but we have also verified our de- 
striping codes on simulated data sets with foreground. We 
found that the foreground has an insignificant impact on 
the baselines determined by the destriping, in agreement 
with the discussion in Maino et al. H2UU2|I . The quality of 
destriping is also almost independent of the impact of an- 
other class of instrumental systematic effects, main beam 
distortions and straylight, as the temperature differences 
at crossing pixels are dominated by the noise and only 

Simulation Software is part of the Level S of the 
Planck DPCs and is available for Planck collaboration at 
:/ /planck.mpa-garching. mpg.de 
^ SDE is one of the two methods in the Planck Level S 
pipeline for producing simulated instrument noise. The method 
was implemented by B. Wandelt and K. Gorski and modified 
by E. Keihanen. 
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Fig. 1. A 5 hour segment of the noise TOD after coadding 
(grey) and the baselines (black) fitted to it. For this figure 
the 5-hour average is subtracted also, to center the figure 
at /iK. The baselines were obtained using the weight 
function w = l/up. The difference between baselines ob- 
tained with different methods would be too small to show 
up clearly in this figure. 

0.05 1 1 1 1 1 1 1 
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Fig. 2. The histogram (in bins of 50 /iK) of the base- 
lines from a set of 10 simulated 7-month noise TODs. The 
three curves correspond to two different weight functions 
(see Sect. I3.2|l used to determine the baselines: w = l/up 
(thick black) and w — I (thin black), and to the reference 
baselines, i.e., noise averages over the one hour ring (grey). 

minimally affected by the spurious signals (w jiK) intro- 
duced by optical distortions (see e.g. Burigana et al. 2001). 

Fig. |31 shows the input Ci spectrum and the spec- 
trum derived from the simulated TOD after destriping. 
We used the Anafast code of the HEALPix package to 
compute the Ce spectrum of the destripcd map. We sub- 
tracted from the derived spectrum an estimate of the noise 
level Cnoise = 0.197/iK^ (estimated as the average of d 
over £ = 980 . . . 1000) and corrected the spectrum for the 
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beam shape convolution and pixel convolution. The an- 
gular spectrum shown is thus Ci = {Ci — Cnoise) / {B^ hf) , 
where Be = exp(-cr^£(£ + l)/2), with at = IO'/a/sT^, 
is the beam convolution function corresponding to the as- 
sumed beam width of 10' (FWHM), and hi is the pixel 
convolution function (provided by the HEALPix package) . 



, X 10 




1000 



Fig. 3. The input Ce spectrum and the spectrum com- 
puted from the CMB map. The latter has been corrected 
for beam and pixel convolution, and for white noise level. 
Destriping was done fitting uniform baselines only. 
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Fig. 4. Same as Fig.|21 but for noise level lowered by fac- 
tor 1/V24. This simulates the effect of combining 24 de- 
tectors. 

Fig. 0] shows the same for a noise level reduced by 
the factor \/24, corresponding to the combination of 24 
detectors. Here the subtracted noise level was Cnoise — 
0.197AiKV24 = 0.0082/iK2. 



Note that Figs. |3| and 01 are for illustration only, as this 
paper does not address the full CMB Cg estimation prob- 
lem, and thus we have just used the above crude estimate 

fol" ^noisc ■ 

3.2. Comparison of different weighting schemes: maps 
and angular power spectra 

We have written a destriping code which allows us to com- 
pare the different weight functions discussed in Sect. 12.31 
We use the Cholesky decomposition technique to solve the 
set of linear equations. 

We have chosen the root-mean-square (rms) value of 
the residual noise map (see below) pixels as a figure of 
merit that we use to compare different destriping methods. 
This map rms is related to the Ce spectrum through the 
relation 



E 



2e + i 

47r/sky 



Ce 



(18) 



The map rms squared is thus a weighted sum of the an- 
gular spectrum, with high weight on high multipoles. The 
sky coverage /sky = 0.985 enters here because we have 
computed our rms values over the visited pixels only. 
When computing Ce spectra, we have set T = in the 
remaining pixels. 

We compute the residual noise map by taking the 
difference between the destriped map and the noise-free 
reference map and subtracting the monopole component. 
Note that because of the incomplete sky coverage, remov- 
ing the monopole affects the Ce spectrum at all £ (not 
only i = 0). The reference map is computed by coadding 
the pure signal TOD into a map of resolution A^sido=512. 
The expected contribution from white noise to the residual 
map rms is 220.95 /iK. 

While the rms of the residual noise map is a natural 
measure of the CMB map quality, the main scientific in- 
terest is perhaps not in the CMB map itself, but rather in 
its angular power spectrum Ce. It is therefore of interest to 
see the impact of the destriping methods on the different 
parts of the Ce spectrum. The map rms is dominated by 
the high £ part, and does not reveal the difference in per- 
formance of the various methods in the low £ part. Thus 
we have computed the angular power spectra Ce of the 
residual noise maps. 

Because of the random nature of the noise, the result of 
a comparison between methods may vary from one noise 
realization to another. We therefore performed destriping 
10 times, with different realizations of instrument noise. 
The underlying CMB map was kept the same. We then 
used the average of the 10 residual map rms values and 
the residual noise map Ce spectra to compare the methods. 
We also calculated the standard deviation (std) of the 10 
map rms and Ce values, to see whether the differences 
between the methods were statistically significant. Thus 
this average rms approximates the expectation value for 
the rms with an accuracy of std/-\/lO- However, the std 
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Table 1. Average (avg) rms and std of rms of the residual 
noise map for different weight functions. The average and std 
are taken over 10 noise realizations. The corresponding Ci spec- 
tra are shown in Fig. |3 The last line gives the reference rms, 
which would be reached if one could determine the baselines 
exactly. The differences between the rms are significant only if 
they are larger than the std. We show extra digits for the rms 
in Tables 1 and 2 to show the systematic (but insignificant) 
difference (see text) between the w = l/ijip — 1) and w = l/up 
cases. 



Weif 


;ht 


avg rms//iK 


std of vms/fiK 


■w — 


1 


225.1619 


0.072 


w — 


1/K - 1) 


224.4332 


0.073 


w — 


l/Up 


224.4443 


0.073 


ref. 




224.1170 


0.075 



Table 2. Average rms and std of rms of the residual noise 
map, for a simplified noise model. The noise consists of uniform 
baselines + white noise. 



Weif 


;ht 


avg rms//iK 


std of rms/fiK 


w — 


1 


221.4816 


0.069 


■w — 


1/K - 1) 


221.1041 


0.072 


w — 


l/Up 


221.1039 


0.072 


ref. 




220.9593 


0.073 



itself tells us how much we can expect the residual map 
rms for a single realization to deviate from this average 
value. 

Table 1 presents a comparison between different weight 
functions discussed in Sect. 12.31 The corresponding Ce 
spectra are shown in Fig. |5| Since from our simulations 
we have the noise streams available separately, we also 
computed reference baselines as the average of the noise 
stream over each ring. This reference baseline can be 
thought of as the "true" baseline of the noise. For compar- 
ison, we then subtracted the reference baselines from the 
TOD and computed the rms of the resulting map. This 
represents an ideal situation, where we could determine 
the baselines exactly. The reference rms is given on the 
last line of Table 1. Actual residual noise rms values are 
always larger. 

Weight functions w = l/up and w = l/(fip — 1) in 
Eq. (|16|l give similar results, due to the fact that for most 
pixels Up ^ \. Weight w — Xjup suggested by our max- 
imum likelihood analysis is clearly superior to w — 1. 
However, the weight w = 1/ (rip — 1) gives an even smaller 
rms, although the difference is very small. The difference 
of the rms between w — \ and w = Xjup is significant 
because the difference is about 10 times larger than the 
respective std of the rms. Although the difference between 
w = l/rip and w = l/{np — 1) is much less than the std 
between different noise realizations, it was in the same 
direction in each realization. Note that we were able to 
measure this small difference only because we used the 
same set of random seeds for all weighting schemes. 




,3 



Fig. 5. The Ci spectrum of the residual noise map, for 
different choices of the weight function w. The spectra 
are averages over 10 realizations of noise. Only uniform 
baselines are fitted. The lower solid line corresponds to 
the choice w = l/up for the weight function in Eq. H16|l 
and the dashed line to w = const. The difference between 
weight functions w — l/up and w — l/(np — 1) is too small 
to show on this plot. The difference between them is plot- 
ted in Fig. (lower panel). The upper solid (gray) line 
shows the spectrum of a naive coadded map (no destrip- 
ing). The dash-dotted (gray) line shows the ideal reference 
spectrum, computed by removing the reference baselines. 
The corresponding map rms values are shown in Table 
1. The dotted line shows the theoretical white noise level 
0.192/zK2. 

Since the difference between the weight functions w — 
l/up and w — 1/ {up — 1) is so small that it does not show 
up in Fig. |5l we show just the differences in Fig. |S| 

It is well known that maximum-likelihood analysis 
should provide the minimum-variance solution. Therefore 
it may at first seem surprising that the maximum- 
likelihood based weight function did not give the best re- 
sults. However, the maximum-likelihood solution is the 
optimal one only if the model used corresponds to re- 
ality. Here we have modelled the 1// noise component 
in the TOD by a uniform baseline, which is a simplified 
model. Further, we have assumed that the baselines are in- 
dependent from ring to ring. The reason for the maximum- 
likelihood solution not giving the best result is that the 
noise model used in the analysis does not exactly corre- 
spond to the actual noise properties. 

To verify this, we re-generated our input noise in a 
way that better corresponds to the model assumed in the 
analysis. We generated the 1// noise in the usual way, but 
then, for each ring, we took the average over the ring, and 
replaced the original 1// contribution to the ring with this 
average value, on top of which we added white noise. This 
way we obtained noise which still has a realistic correla- 
tion between scanning rings, but consists of baselines -|- 
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Fig. 6. Differences between the Cg spectra shown in Fig.jS] 
a) The difference Ci{w = const) — Ct{'w = l/up). b) The 
difference Ce{w = l/(rip — 1)) — Ciiw = l/Up). Note the 
much expanded vertical scale in Fig. 03. 
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Fig. 7. Three examples of individual Ci spectra, from dif- 
ferent noise realizations, after destriping. Only uniform 
baselines are fitted. The average of 10 such spectra is 
shown in Fig. [S] (the solid line there) . 



white noise only. The results are shown in Table 2. Now 
the maximum-likelihood based weight function gives the 
smallest variance map. 

Thus it seems that the slightly better performance of 
the Delabrouille weighting scheme {w — ^/{rip — 1)) is 
related to the effect of that part of the correlated noise 
which deviates from uniform baselines. 

The scatter in the individual residual noise map Ci val- 
ues from one noise realization to another was larger than 
the difference between the methods. (See Fig. |7| for the 

spectra from the first three noise realizations, using 
our weighting scheme, w = l/rip.) The difference between 



w — 1 and w = 1 jup becomes however statistically signif- 
icant when the Cn are binned into larger bins. 

4. Increasing the number of base functions 

As shown by Delabrouille H1998|l and Maino et al. lfMH?l 
I2()()2|l . a simple model with uniform baselines and white 
noise models quite well the coaddcd noise stream. One can 
try to further improve the performance of destriping by in- 
troducing more base functions. Delabrouille fTMB| found 
that the addition of a small number of base functions im- 
proved the performance of destriping, whereas Maino et 
al. H2(JU2|I found no significant improvement. The differ- 
ence between these results could be due to the different 
noise spectra considered: 1// noise for Maino et al. H2002|l 
and for Delabrouille ifBH^Hjl . 

We generalize the discussion of Sect. 12 to include arbi- 
trary base functions. We model the correlated noise com- 
ponent of the TOD by a linear combination of base func- 
tions as ricorr = Fa. Here F is a matrix, whose columns 
contain the base functions, and a is a vector containing 
their (unknown) amplitudes. It is convenient to select an 
orthogonal set of the base functions, so that F-'^F is diag- 
onal. Eqs. 1^- 1)13(1 hold as such for general baselines. 

We have studied two sets of base functions: Fourier 
components and Legendre polynomials. Both form an or- 
thogonal set. 

4.1. Simulation results 

The solution of the general destriping problem involves 
the solution of a large linear system of equations 

F^ZFa = F^Zy. (19) 

Matrix A = F-^ZF becomes very large if several base func- 
tions are fitted. We use the iterative conjugate gradient 
method (see, e.g., Press et al. I1992f) to solve the system. 
The conjugate gradient method only requires multiplica- 
tion by matrix A. That can easily be done algorithmically, 
without actually storing the full matrix A at any one time. 
The conjugate gradient method allows us to perform the 
destriping in a relatively small memory space. 

Note that we do not need to add the term 11"^, since 
the conjugate gradient method has the property that, 
when solving system Ax — b, it automatically finds the 
solution for which x^x = 0, if iteration is started with 
x = 0, and Axq — and b-^xo — 0. (The amplitude vec- 
tor which gives unit amplitudes to the uniform baselines 
and zero amplitudes to other baselines is an example of 
such a vector Xq, so the average of the uniform baselines 
is set to zero. The case of other possible vectors in the null 
space of A is discussed further below.) 

We compare four sets of functions: 

1. ("Un.") Uniform baselines only. 

2. ("Fl") Uniform baseline + first Fourier frequency, 
which gives three components for each ring: constant, 
sin(27r/sc0, and cos(27r/sct) {q = 0, 1, 2). Here U = 1/(60 
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s) is the scanning frequency. 

3. ("LI") Uniform baseline + 1st (linear) Legendre poly- 
nomial. 

4. ("L2") Uniform baseline + 1st and 2nd Legendre poly- 
nomials. 

The first Legendre polynomials are L{x,0) = 1, 
L{x,l) = X, and L{x,2) = i(3a;2 - 1), for x G [-1,1]. 

Again we averaged the residual noise Ci spectra and 
the residual noise map rms over 10 noise realizations. 

Table 3 and Fig. |H1 present our results of fitting several 
base functions. We find no improvement in the map rms. 

The last column of Table 3 gives a reference rms, which 
was computed as follows. We defined the reference ampli- 
tude vector as slq = (F-^F)~^F^n, where n is the pure 
noise stream (i.e. ricorr -I- n of Eq. Q. We coadded a map 
from the TOD stream, from which we had removed the ref- 
erence base functions, y — Fslq, and computed the rms of 
the residual noise map. Fourier components give the low- 
est reference rms, showing that Fourier components model 
the noise best of our base function sets. However, when we 
fit the data, Fourier components give the poorest results, 
as the first column of Table 3 shows. The worst of the 10 
runs gave a rms of 452 /iK. The code took a long time 
to converge, and the final maps contained a very strong 
and obviously unphysical dipole-like structure. (The large 
std in the case of Fourier components does not mean that 
for some realizations the Fourier components would have 
given a lower rms than the other methods. Rather, the 
distribution was just very skew; a small number of real- 
izations gave a very large rms, but all 10 realizations gave 
an rms larger than the average for any of the other meth- 
ods.) 

The problem with Fourier components is related to 
small eigenvalues of matrix A. If there exists a map m' 
such that 

Pm' = Fa' (20) 

for some a', then a' is an eigenvector of A with zero eigen- 
value. If a is a solution of Eq. (|19() . then a -I- a' is also a 
solution. In other words, if we can produce the same TOD 
stream both as a combination of the base functions, and by 
picking it from a map with our scanning strategy, then it 
is not possible, without further information, to tell if this 
TOD component comes from noise or signal. In practice 
zero eigenvalues are unlikely to happen, but already small 
but non- vanishing eigenvalues cause similar problems. Eq. 
H20|) then holds approximatively. 

The difficulty of fitting Fourier components is related 
to the symmetry of the nominal scanning strategy of 
Planck. Suppose the scanning rings are regular circles 
with centers on the ecliptic plane. If we give all cosine 
and sine components equal amplitudes a and b such that 
the resulting function a cos (f> + b sin (0 runs from to 
27r around the ring) is at maximum at the northernmost 
point and at minimum at the southernmost point (or vice 
versa), and coadd a map from this TOD stream, we ob- 
tain a meridian symmetric map for which relation (|20|l 
holds. This noise component cannot be resolved without 



Table 3. Average rms (in /iK) of the residual noise map, std 
of rms, and reference rms, for different sets of base functions. 
The reference rms is computed from a map from which the 
reference baseline functions are removed. The base functions 
were: Un: uniform baseline, Fl: three Fourier modes, LI (L2): 
Legendre polynomials up to 1st (2nd) order. The two first lines 
('Un.') represent the same destriping methods as the first and 
third lines of Table 1. 



fit 




avg rms//iK 


std of rms//iK 


ref. rms 


Un. {w = 


1) 


225.162 


± 0.072 


224.117 


Un. {w = 


l/Up) 


224.444 


± 0.073 


224.117 


Fl 




264.131 


±70.460 


223.621 


LI 




224.463 


± 0.077 


223.860 


L2 




225.049 


± 0.463 


223.748 



further information on noise properties. The same prob- 
lem also affects fitting 2nd order Legendre polynomials, 
albeit less severely. We would expect that other, less sym- 
metric, scanning strategies would not be as prone to such 
problems. 

We discuss this problem quantitatively in the follow- 
ing. 




Fig. 8. Average d spectrum of the residual noise map, for 
different choices of the base functions. Thick solidlinc: uni- 
form baselines. Thin solid lower line: three Fourier compo- 
nents {q — 0, 1, 2). Dashed and dot-dashed lines; Legendre 
polynomials up to 1st and 2nd order. The upper solid 
(gray) line shows the spectrum of a naive coadded map 
(no destriping). The spectra were averaged over 10 real- 
izations of noise. 



4.2. Eigenvalue analysis of the destriping problem 

Consider the solution of Eq. (|19(l in light of eigenvalue 
analysis. 
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Assume the TOD stream is of the form 



4.3. A practical method 



y = Pmo + Fan + n 



(21) 



where mo is the actual map, &o is the "true" baseline 
vector, and n represents the remaining noise component. 
With these assumptions, Eq. (|19|l becomes 



F^ZFa = F^ZFan + F^Zn. 



(22) 



Let Xi and Ui be the eigenvalues and corresponding 
orthonormal eigenvectors of matrix A = F^ZF. Because 
A is symmetric and non-negative definite, all eigenvalues 
are positive or zero, Xi > 0, and the eigenvectors form a 
complete orthogonal basis. The matrix can be presented 
by its eigenvectors as 



F^ZF = ;^A,u,uf. 



(23) 



We can expand a = J^i '^i'^i ag = J^i "^i ^^'^ 
F^Zn^^CU,. (24) 

i 

We substitute these expansions into (|22|l to find 

a, = + cJX,. (25) 

Here it must be understood that this relation holds for 
components for which Ai > 0. For = one can easily 
show that Ci — and remains undefined. 

Consider then the statistical properties of coefficients 
Ci. Since Ci = uf F"^Zn, we find, assuming that n is white 
and (nn-'") = diag{a^), that (q) = and 



■ik ■ 



Looking back at Eq. j^SJ we observe that 
((ofe - a2)(a, - a°)) = —^ik- 



(26) 



(27) 



We see that if one of the eigenvalues is very small, then 
the inaccuracy in the corresponding component is very 
large. Actually, the vanishing eigenvalues do not pose a 
problem, since the conjugate gradient algorithm always 
sets the corresponding amplitude to zero. Problems are 
caused by moderately small, but non- vanishing, eigenval- 
ues. How small a value must be regarded as zero depends 
on the floating point accuracy of the computer and on 
the convergence criterion one has chosen for the conju- 
gate gradient algorithm. 

We have seen that fitting uniform baselines only al- 
ready gives good results. There is no advantage in trying 
to fit additional components which have a large inaccu- 
racy. Fitting poorly determined components causes more 
error than leaving them out entirely. We therefore aim 
to fit only components that correspond to a large eigen- 
value, and eliminate small eigenvalue components. In the 
following we present a practical method to do this. The 
method presented does not require full determination of 
eigenvalues or eigenvectors of matrix A, which would be 
a computationally expensive task. 



Consider the following equation: 
(F^ZF -I- eF^F)a = F'^Zy 



(28) 



where e is a small positive constant. An eigenvalue analysis 
similar to that presented above shows that the solution of 
Eq. is related to the solution of (fTi^ through 

a'l = a» , \ , (29) 

where X,nax is the largest possible eigenvalue of A. With 
our chosen normalization F^F = diag{nt,) it is equal to 
the number of samples on a ring, Xmax — n^. 

The effect of the e term in Eq. H28|l is to wash out com- 
ponents with eigenvalues smaller than eXmax, while the 
large eigenvalue component remains unaffected, as long 
as e is small. At the limit e ^ the solution of Eq. H28|l 
approaches that of Eq. H19() . 

We have repeated our computations with this method. 
Table 4 presents our results for different values of e. The 
results were again obtained using 10 different realizations 
of the instrument noise, over which the average and the 
standard deviation were calculated. We see that, with val- 
ues 10~^ < e < 10"'^, the accuracy of fitting Fourier com- 
ponents is strongly improved with respect to the e = 
case. Also the required computation time is reduced. For 
2nd order Legendre polynomials we also find a clear im- 
provement. 

However, the results for multiple base functions are 
still worse than for uniform baselines only. 

Uniform baselines and 1st order Legendre polynomials, 
which exhibited no problems with e = 0, are unaffected 
with e = 10^^ or less, but with e — 10~^ or larger the 
accuracy of fitting them begins to deteriorate. 

Fig. ini shows the residual noise Ct spectra for the e = 
10^'' case. 

Depending on the number of base functions, the code 
took 3-5 seconds per iteration step on one processor of an 
IBM eServer Cluster 1600 computer. The total computa- 
tion time varied between 2 and 30 minutes. 

To illustrate the use of several base functions we show 
in Fig.^lthe same 5 hour coadded noise TOD as in Fig.^ 
but now with different sets of base functions. Note that the 
deviations from uniform baselines are exaggerated in this 
figure. The actual amplitudes of the other base functions 
are much smaller (by about a factor of 20) than those of 
the uniform components. 

In order to check how the results depend on the knee 
frequency, we repeated our computations with rescaled 
noise. We took the 1// noise stream, which was originally 
generated with = 0.1 Hz, scaled it by a factor 0.5 or 2, 
and added white noise with the same variance in all cases. 
This is equivalent to changing the knee frequency by a 
factor of 0.25 or 4. We thus have results for three knee 
frequencies: fk = 0.025Hz, /k = O.lHz, and /k = 0.4Hz. 

The obtained residual map rms for fk = 0.4Hz and 
/k = 0.025Hz are shown in Tables 5 and 6, for differ- 
ent values of e. The optimal value of e seems to depend 
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Table 4. Average rms (in /xK) of the residual noise map, std 
of rms (middle), and number of iteration steps for different sets 
of base functions and for different values of e {fk =0.1 Hz). 
The last line gives the reference rms. Parameter e is defined in 
Eq. I28II . The base functions were: Un: uniform baseline, Fl: 
three Fourier modes, LI (L2): Legendre polynomials up to 1st 
(2nd) order. 



e 


Un. 


Fl 


LI 


L2 


rms 





224.444 


264.131 


224.463 


225.049 


10-^ 


224.444 


251.648 


224.463 


225.047 


10"^ 


224.444 


229.675 


224.463 


225.025 


10"^ 


224.444 


226.034 


224.463 


224.866 


10-* 


224.444 


225.640 


224.463 


224.563 


10"^ 


224.463 


225.438 


224.502 


224.678 


10"^ 


226.192 


230.141 


227.691 


230.396 


ref. 


224.117 


223.621 


223.860 


223.748 


std of rms 





0.073 


70.460 


0.077 


0.463 


10-^ 


0.073 


48.449 


0.077 


0.461 


10-*^ 


0.073 


7.044 


0.077 


0.445 


10"^ 


0.073 


0.450 


0.077 


0.327 


10-* 


0.073 


0.168 


0.077 


0.109 


10"^ 


0.072 


0.196 


0.078 


0.130 


10"^ 


0.230 


0.803 


0.491 


1.311 


Iteration steps 





28 


373 


39 


130 


10-^ 


28 


369 


39 


130 


10-*^ 


28 


351 


39 


130 


10"^ 


28 


318 


39 


128 


10"* 


28 


166 


38 


119 


10^^ 


27 


102 


37 


95 


10"^ 


25 


46 


31 


47 



somewhat on knee frequency, being smaller at higher knee 
frequencies. 

The Ci spectra for e ~ 10^^ for knee frequencies /k = 
O.lHz, /k = 0.4Hz, and /k = 0.025Hz are shown in Figs.El 
El 113 respectively. The std of the Q for the /k = O.lHz 
case are shown in Fig. El 

Looking at the average Ci spectra of residual noise and 
their std we see that fitting additional base functions de- 
creases the accuracy of destriping at low £. However the 
situation for the map rms values in Table 5 seems more 
complicated. For /k — O.lHz or smaller, fitting additional 
base functions does not improve the performance of de- 
striping, but with /k = 0.4Hz fitting one or two Legendre 
polynomials besides the constant baselines decreases the 
map rms, while Fourier components still give inferior re- 
sults. The improvement in the map rms for /k = 0.4Hz, 
when fitting Legendre polynomials, comes from the high 
multipoles. This can be seen from Fig. 1131 where we plot 
the difference between the residual noise C'e obtained when 
fitting Legendre polynomials or Fourier components, and 
when fitting uniform baselines only. 

We see that increasing the number of base functions 
improves the high £ but worsens the low £ part of the Ci 




,3 



Fig. 9. Same as Fig. (HI but for the improved method with 
£ = 10"^. Thick solid line: uniform baselines. Thin solid 
lower line: three Fourier components {q = 0, 1, 2). Dashed 
and dot-dashed lines: Legendre polynomials up to 1st and 
2nd order. The corresponding map rms values are shown 
in Table 4. 




-3000 1 ' ' ' ' ' 1 

3453 3454 3455 3456 3457 

Ring number 

Fig. 10. Same as Fig.^ but now with different sets of base 
functions. The amplitudes of the other base functions were 
much smaller than the uniform components, so that it 
would be difficult to see the small deviation from uniform 
baselines. Therefore we have exaggerated the deviation 
from the uniform-baseline case by a factor of 5 in this 
figure. Dashed line: uniform baselines. White solid line: 
three Fourier components (g — 0,1,2). Black solid lines: 
Legendre polynomials up to 1st and 2nd order. 

spectra. This is true both for Fourier components and for 
Legendre polynomials. This trend persists for lower /k, but 
the value of £ above which we get an improvement goes 
up and the improvement for those £ becomes smaller. 

Delabrouille (TSW) obtained improved results by fit- 
ting several base functions already with /k = O.lHz. 
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Legendre 1 



Legendre 2 



10 10 10 10 

1 

Fig. 11. Same as Fig.lHlbut for knee frequency /k = 0.4Hz. 




Fig. 12. Same as Fig. |51 but for knee frequency /k = 
0.025Hz. 



The difference between our results and his is probably 
due to differences in the noise model. While we assume 
P cx /~^, as appropriate for LFI radiometers (Seiffert 
et al. I2002|l . Delabrouille assumes a noise spectrum of 
the form P cx /^^ to account also for possible thermal 
fluctuations and atmospheric noise in ground based and 
balloon borne bolometer experiments. This leads to more 
low-frequency noise for a given knee frequency. 

The std of the residual noise Ci influences the accuracy 
at which the Ce spectrum of the CMB can be estimated 
from the noisy data. From Fig. El we can see that at 
low £ uniform baselines give the best performance in the 
sense that the Ci of the residual noise varies the least 
from one realization to another. At high £ there is no clear 
difference between the performances of different sets of 
base functions. 




200 400 600 800 1000 

I 

Fourier 
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Fig. 13. Differences between Cf spectra shown in Fig. 1111 
(/k = 0.4Hz). The three panels show the change in the 
spectrum when fitting Legendre polynomials up to a) 1st 
or b) 2nd order, or c) three Fourier components, instead 
of uniform baselines only. 




Fig. 14. Standard deviation (std) of the residual noise Ci 
over 10 noise realizations, for different choices of the base 
functions {fk = 0.1 Hz, e — 10^^). Thick black line: uni- 
form baselines with w = l/up. GreyVme: uniform baselines 
with w — \. Thin solid line: three Fourier components 
{q = 0, 1, 2). Dashed and dot-dashed lines: Legendre poly- 
nomials up to 1st and 2nd order. The corresponding map 
rms and std of map rms values are shown in Table 4. 

5. Conclusions 

We have presented a maximum-likelihood formulation of 
the destriping approach to the CMB map-making prob- 
lem, and a rigorous derivation of the destriping algorithm, 
and we have applied it to the case of the Planck mission. 

We have formulated the method in matrix form, which 
allows us to apply the conjugate gradient technique in such 
a way that we can handle very large data sets. 
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Table 5. Average rms of the residual noise map (in fiK), for 
different sets of base functions and for knee frequency fk = 0.4 
Hz. The base functions were: Un: uniform baseline, Fl: three 
Fourier modes, LI (L2): Legendre polynomials up to 1st (2nd) 
order. The last line gives the reference rms. The corresponding 
Ci spectra for e — 10~* are shown in Fig llll 



e 




Un. 


Fl 


LI 


L2 







234.184 


258.343 


233.947 


234.544 


10" 


-6 


234.184 


238.107 


233.947 


234.512 


10" 


-5 


234.183 


235.603 


233.947 


234.304 


10" 


-4 


234.183 


235.315 


233.947 


233.910 


10" 


3 


234.252 


235.643 


234.102 


234.514 


ref. 




233.337 


231.626 


232.430 


232.045 


e 6. Same as Table 5, but for = 0.025 Hz. The 


hng Ci 


spectra for 


e = 10-* 


are shown 


in Figini 


e 




Un. 


Fl 


LI 


L2 







221.943 


262.149 


222.030 


222.597 


10- 


-5 


221.943 


223.588 


222.030 


222.433 


10" 


-4 


221.943 


223.173 


222.030 


222.163 


10" 


3 


221.948 


222.825 


222.038 


222.149 


10" 


-2 


222.385 


223.676 


222.835 


223.565 


ref. 




221.753 


221.575 


221.665 


221.626 



We have compared the three different destriping meth- 
ods, the one derived here and the other two already pre- 
sented in the hterature, using simulated Planck data 
(one 100 GHz LFI detector). The differences between 
these methods can be expressed in terms of a weight func- 
tion, which varies between methods. This function assigns 
weights to pixels based on the number of observations 
falling on that pixel. 

We found that our new method provides some im- 
provement to the method used in Burigana ct al. (1997) 
and Maino et al. H1999I I^U2fl . However, our new method 
was not better than the method given by Delabrouille 
(|1998|l . although he gives only a heuristic justification for 
his weight function. The difference between the latter two 
methods was insignificantly small, but was systematic. 
That the maximum-likelihood derivation did not lead to 
the optimal method in practice is due to actual noise prop- 
erties differing from the idealization used in the deriva- 
tion. We recommend using either the weight function de- 
rived here (w = l/^-p) or the one given by Delabrouille 
(u; = - 1)). 

We have tested the possibility of improving the accu- 
racy of destriping by fitting more base functions besides 
the uniform baseline, but we have found no systematic im- 
provement in the case of instrumental 1// noise. (Fitting 
several base functions may be more beneficial when remov- 
ing other types of systematics, i.e. periodic fluctuations in- 
duced by thermal instabilities.) The optimal selection of 
base functions seems to depend on the actual spectrum of 
the noise, and on which multipolcs one is mainly interested 
in. However, the great virtue of the destriping method is 



its simplicity: it does not require prior information on the 
noise spectrum. We lose this advantage if we incorporate 
information on the noise spectrum into the method. 
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